SUMMARY

Objective 

The SOPHIE program has now been extended to the point
where a person capable of troubleshooting an electronic
device and capable of talking with a human tutor should be
able to carry out a meaningful dialogue with it. The
modifications performed on its grammar processing allow for
context-dependent imprecise specification of requests such
as occur in a casual conversation between people working on
devices in an electronics repair shop. The added ability to
request suggestions on what might be investigated next
corresponds closely to a request which a student might
expect to make of a human instructor. The instructor,
looking at what the student sees as symptoms, would direct
him to several general possibilities and expect the student
to isolate those possibilities.

Approach 

SOPHIE has reached a point where further development
requires feedback from student usage to be effective. This
should be considered in any further development effort.
Feedback has thus far been obtained from people with
background and skills exceeding that of the target trainee
and is thus only partially relevant as an evaluative
situation.

Recommendation

Not yet investigated is the potential for use of a
SOPHIE-like program by instructional developers as an
authoring aid. It is clear now that programmed text or CAI
material on the IP-28 power supply could be very easily
authored using SOPHIE as a communicative expert on how this
circuit operates or, even more directly, as the producer of a
data base from which simpler paradigms than the
mixed-initiative could be devised. In this sense SOPHIE
should be thought of as the beginning of a new approach to
authoring as well as a sophisticated form of instruction.
The possibilities of using SOPHIE-type techniques in the
training of such areas as aircraft flight should also not be
ignored. The ramifications of an "intelligent" programmed
expert carefully "watching" a student fly a simulator and
telling him in detail where he failed in real time might
prove to be very economically viable.

CHAPTER 1 - INTRODUCTION



This report covers our research and development efforts
on SOPHIE* for the six month period ending June 30, 1974.
The first part of the report reviews the overall goals of
the SOPHIE project and then presents an annotated dialogue
of a student using the most recently released version of the
system. This dialogue exhibits many of the features that
have been developed during this six month period. Following
the dialogue we include a brief but self-contained
description of the basic inferencing techniques which enable
this system to achieve its question answering, hypothesis
evaluation, and hypothesis generation behavior. It is
intended that this first chapter contain sufficient detail
that the reader need not have studied our prior final report
in order to understand the remaining chapters.

The second and third chapters of the report concentrate
on the new features of SOPHIE. Chapter 2 provides a
detailed description of the theory formation or hypothesis
generation abilities of this new version. Although the
original system performed some limited hypothesis
generation, we have greatly expanded this module and, more
important, we have found a novel and powerful use for these
extended capabilities. In the original version, hypothesis
generation was used solely to provide a student with help
when he ran out of viable ideas about what could be wrong
with the instrument. During the last six months we have
discovered that by making the hypothesis generation system
"complete" (i.e., in the sense of it being able to construct
all single fault theories that are logically consistent with
the known measurements) we can use it to verify whether or
not a new measurement is logically redundant with respect to
the current or known set of measurements. In other words we
can use this module to determine whether or not the next
given measurement could in any way add new information about
what could be wrong by checking to see if it reduces the
list of possible faults which the hypothesis generation
system produces. This, for example, would enable us to
automatically grade a student's sequence of measurements.
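
The redundancy test described above can be sketched in a few lines. This is only an illustration in Python, not SOPHIE's implementation (which is written in INTERLISP and FORTRAN); the fault names, the table of predicted values, and the tolerance are all hypothetical.

```python
# Hypothetical table: for each single-fault theory, the value each
# measurement would show if that fault were present (made-up numbers).
PREDICTED = {
    "Q3 beta low": {"output voltage": 11.7, "Q6 base-emitter voltage": 0.35},
    "Q4 beta low": {"output voltage": 11.7, "Q6 base-emitter voltage": 0.35},
    "R22 low":     {"output voltage": 11.7, "Q6 base-emitter voltage": 0.59},
}


def consistent_faults(observations, tolerance=0.05):
    """Return every fault whose predictions match all observations so far."""
    return [
        fault
        for fault, predicted in PREDICTED.items()
        if all(abs(predicted[name] - value) <= tolerance
               for name, value in observations.items())
    ]


def is_redundant(observations, name, value):
    """A new measurement is redundant if it eliminates no candidate fault."""
    before = consistent_faults(observations)
    after = consistent_faults({**observations, name: value})
    return len(after) == len(before)


known = {"output voltage": 11.7}
# Repeating a known measurement adds nothing; a measurement that rules
# out one of the candidate faults does.
redundant = is_redundant(known, "output voltage", 11.7)
informative = not is_redundant(known, "Q6 base-emitter voltage", 0.35)
```

A grading scheme along the lines suggested in the text could, under these assumptions, simply count how many of a student's measurements were redundant at the time they were taken.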

Chapter 3 describes the major additions made to the
natural language front-end of SOPHIE. By extending our use
of a semantic grammar, we can now handle utterances that are
incomplete sentences or that involve the use of pronoun
referents, etc. This new processor achieves this
capability by using the previous questions (i.e., those just
asked by the student) to establish a context for the
dialogue. This context is then used to make explicit any of
the implicit information in the student's current question.
We think that this new natural language processor makes
SOPHIE one of the first systems that has successfully coped
with the unique problems of man-machine dialogue and as such
makes it one of the most "habitable" or friendliest
systems around.

*A SOPHisticated Instructional Environment

SOPHIE's Goals and Objectives

SOPHIE represents a major step toward the goal of
producing a "reactive" learning environment. In an ideal
reactive environment the student is encouraged to explore
ideas, create conjectures or hypotheses about a situation
and then to receive immediate detailed feedback as to the
logical validity of these ideas. In those cases where his
ideas or proposed solutions have logical flaws, the system
creates relevant counter-examples or critiques so that the
student can start to debug his ideas. In short, a reactive
learning environment extends Carbonell's original concept of
mixed-initiative Computer Assisted Instruction (CAI) to the
point where the student has a one-to-one relationship with
an "expert" (system) which in some ways can surpass the
inferential capabilities of most human tutors. Of course,
creating a system that has both the depth and breadth of a
human tutor is far beyond the current state of the art, but
by carefully choosing a domain of knowledge for which we
have extremely powerful inferencing mechanisms, we can
create an artificially intelligent "expert" system which can
patiently provide the student with a logically deep sounding
board for his own ideas.

SOPHIE was designed to fulfill three main objectives.
The first was to demonstrate that the notion of using
Artificial Intelligence (AI) techniques to build an
"intelligent" CAI system (ICAI) was not purely a pipe dream
but that in fact a system could be built that was
sufficiently complete and efficient that it could be used as
an experimental tool in a classroom environment. The second
objective was to explore some new dimensions for CAI which
exploit the significant increase in computational power
provided by current advances in hardware technology. It
seemed fruitful to begin such an investigation today so that
we are prepared to imaginatively utilize tomorrow's
computers. The third was to fulfill the need for an
environment in which to experiment with new ways of teaching
problem solving skills, such as electronic troubleshooting,
without being constrained to pose only problems having
extensionally defined solution sets. We wanted to allow the
student freedom in choosing the way in which he could go
about solving his problem while still expecting our system
to monitor all his decisions and provide him with useful
feedback without our having to anticipate (and hence program
in) his every move, query, etc.

The idea of using AI techniques in CAI originated
with Carbonell in his mixed-initiative SCHOLAR systems
(Carbonell 1970, 1973). Since then, other systems for
teaching symbolic logic (Goldberg 1973), meteorology (Brown
et al., 1973) and the interpretation of nuclear magnetic
resonance spectra (Sleeman 1974) have explored ways in which
to augment the mixed-initiative system with considerably
more problem solving and inferencing capabilities.
Admittedly, much of the increased logical capability of
these latter systems has been achieved at the cost of
restricting the kinds of generic knowledge to be
represented. However, this trade-off seems eminently
reasonable since these latter systems are not trying to
mimic all the roles of a human tutor.



SOPHIE reflects a major research effort to produce a
CAI system that, on the one hand, produces deep logical
inferences on a domain less formal than symbolic logic and,
on the other hand, is sufficiently complete that it can
answer nearly all questions posed to it by a student. To
the extent that SOPHIE accomplishes this, it overcomes a
major limitation inherent in nearly all intelligent systems.
However, these capabilities are by their very nature complex
and as such require a sophisticated set of strategies and
procedures. For example, this kernel version of SOPHIE
represents approximately 300,000 words (36 bit) of INTERLISP
and FORTRAN code running on a virtual memory TENEX.
Although it is an immense program, it is surprisingly
efficient, exhibiting a typical response delay of around
three seconds on a lightly loaded system and requiring on
the average about two cpu minutes per hour of student use.



Reasons for Choosing Electronic Troubleshooting as SOPHIE's
First Domain of Expertise

There are several reasons that influenced our choice of
electronic troubleshooting as the subject domain around
which to build this system. The first is that it provides
an excellent domain for developing and experimenting with a
reactive learning environment. For example, with the use of
a simulator a student can experiment with a circuit by
modifying its various components and examining the
consequences of these modifications. Within the simulation
context he can quickly make all kinds of measurements (some
of which would ordinarily require the time-consuming
operation of decoupling a component from the circuit). He
need never worry about limiting his experimentation through
fear of blowing up the instrument. Indeed if this happens,
the student can be directly informed that his last
experiment blew certain components, or he could be told that
something blew and be asked to troubleshoot his own
misdoings.

The second, and by far the most important reason for
choosing this domain, is that the lab instructor seldom has
the time to answer the individual questions which arise
while the student is troubleshooting. Also, the instructor
doesn't usually have the time to have each student
articulate the train of hypotheses that he is developing
while troubleshooting. Consequently, the instructor misses
a crucial opportunity for providing the student with
detailed logical analyses of the correctness of his
hypotheses just when the student is most likely to be
interested in such feedback. The reactive environment
provided by SOPHIE has sufficient inferencing capabilities
to circumvent all these limitations. (Note that such
inferencing, or deductive capabilities, represent far more
potential than just the obvious use of a simulator as first
mentioned.)

Basic Scenario 

The basic scenario underlying SOPHIE concerns a student
attempting to isolate a fault in a given piece of electronic
equipment while having a lab instructor standing over his
shoulder to answer questions, evaluate his hypotheses and
pose alternatives to him if he becomes stuck.

In the usual setting, SOPHIE presents the student with
a circuit schematic of the instrument under study and
automatically selects and inserts a fault of some specified
degree of difficulty. The student then tries to debug the
instrument by requesting various measurements. At any time,
he can offer a hypothesis about what he thinks could be
wrong with the instrument and receive an evaluation of his
hypothesis, which specifies whether it is consistent with
what he should have been able to infer from his
measurements. The student can also, at any time, replace
any component, but before a part is replaced, the student is
queried as to what he thinks is wrong with it. If his
answers are correct the component is replaced. In those
cases where he has discovered a part which was blown because
of a deeper fault, the replaced component will be reblown
until he discovers and fixes the fundamental fault. If the
student becomes stuck and cannot think of any faults which
would explain his measurements, he can ask for help. SOPHIE
then examines his measurements and generates plausible
hypotheses (with respect to all the previous measurements)
which he can explore.

In order to illustrate some of SOPHIE's capabilities,
we present a protocol generated by a student using SOPHIE.
In this session the student takes some measurements,
develops an idea, is told why that idea is wrong, gets
stuck, is given some new ideas, and eventually solves his
problem.

Annotated Dialogue

The instrument currently being modeled is the Heathkit
IP-28 regulated power supply, which is a reasonably
sophisticated, six transistor power supply with both current
and voltage limiting behavior. As such it involves two
critical feedback paths which make troubleshooting this
instrument fairly complex.



PROTOCOL

WELCOME TO SOPHIE, A REACTIVE LEARNING ENVIRONMENT. 

THE PIECE OF EQUIPMENT YOU ARE STUDYING IS THE IP-28 REGULATED POWER
SUPPLY. WOULD YOU LIKE A CIRCUIT SCHEMATIC?*

> YES **

*The schematic has been photo-reduced to save space and is included here
to give an idea of the complexity of the circuit.

**Parts typed by the student are underlined.



WOULD YOU LIKE TO START WITH A FAULTED INSTRUMENT WHICH NEEDS FIXING?
> YES
THE INSTRUMENT HAS A FAULT AND ITS PRESENT CONTROL SETTINGS ARE:
CC    1.0   (CURRENT CONTROL-FROM 0.0 (MIN CURRENT) TO 1.0)
CR    HIGH  (CURRENT RANGE SWITCH, HIGH=1 AMP, LOW=.1 AMP)
LOAD  20    (LOAD RESISTANCE IN OHMS)
VC    1.0   (VOLTAGE CONTROL-FROM 0.0 (MIN VOLTAGE) TO 1.0)
VR    HIGH  (VOLTAGE RANGE SWITCH, HIGH=30 VOLTS, LOW=10 VOLTS)

It is assumed that the student has already had an introduction
to the basic principles of the power supply so the above
comments are sufficient to characterize the controls.



>>WHAT IS THE OUTPUT?

THE OUTPUT VOLTAGE IS 11.7 VOLTS.

>>WHAT IS IT IN A WORKING INSTRUMENT?

IN A WORKING CIRCUIT THE OUTPUT VOLTAGE IS 19.9 VOLTS.
THAT VALUE IS INCORRECT.

In case the student doesn't know what the value of a
measurement should have been, he can have any measurement
repeated in a working instrument. Note the student's use of
the pronoun "it" to refer to the entire previous sentence.

>>WHAT IS THE VOLTAGE BETWEEN R8 AND Q1

THE VOLTAGE AT N1 IS 47.6 VOLTS.



>>WHAT IS THE BSE EMITTER VOLTAGE OF THE VOLTAGE LIMITING TRANSITOR?

=BASE

=TRANSISTOR

THE BASE EMITTER VOLTAGE OF Q5 IS -.037 MILLIVOLTS.

The spelling corrector has detected and corrected two
misspellings. The system must also know that Q5 is the voltage
limiting transistor.

>>WHAT ABOUT FOR THE CURRENT LIMITING TRANSISTOR
THE BASE EMITTER VOLTAGE OF Q6 IS .349 VOLTS.

The meaning of this question is completely dependent on the
context established by the prior sentence. The system uses
this context to determine that he is interested in not only
voltage measurements but the voltage across a particular
junction.

>>WHAT SHOULD IT BE?

IN A WORKING CIRCUIT THE BASE EMITTER VOLTAGE OF Q6 IS .593 VOLTS.

Another way of requesting the prior measurement to be repeated
in a working instrument, i.e., the meaning of "should" implies
that a working instrument should be used.

>>REPLACE Q6

I AM GOING TO ASK YOU SOME QUESTIONS ABOUT HOW Q6 IS FAULTED.
ARE ANY JUNCTIONS SHORTED?

Before the system replaces a component it queries him as to
what he thinks is wrong. If he gets at least something right,
the system tells him what was actually wrong and goes ahead and
replaces it.



> I DON'T KNOW

ARE ANY TERMINALS OPEN?

> YES
WHICH TERMINALS ARE OPEN?

> THE BASE

Q6 IS NOT FAULTED IN THAT MANNER. Q6 NOT REPLACED.

>>IS IT POSSIBLE THAT Q6 HAS AN OPEN BASE

The student now has an idea what might be wrong. However,
before the system's evaluation procedures can be applied to his
idea or hypothesis the hypothesis must be completely specified.
To help the student out, the system suggests a plausible
complete specification of it.

IS THE COMPLETE FAULT MODE EB/EC/BC = OP/OK/OP WHAT YOU INTENDED?

> SURE

WHEN THE SETTINGS WERE: CC=1.0, CR=HIGH, LOAD=20, VC=1.0, VR=HIGH

THAT FAULT CONFLICTS WITH THE FOLLOWING MEASUREMENT.

YOU OBSERVED THAT THE OUTPUT VOLTAGE WAS 11.7 VOLTS.

IF Q6 HAD FAULT MODE EB/EC/BC = OP/OK/OP IT WOULD BE 23.9 VOLTS.

IN A WORKING CIRCUIT IT IS 19.9 VOLTS.

In determining the logical consistency of his hypothesis with
respect to the measurements he has already made, a glaring
counter-example is discovered.

ALSO:

THAT FAULT CONFLICTS WITH THE FOLLOWING MEASUREMENT.
YOU OBSERVED THAT THE BASE EMITTER VOLTAGE OF Q6 WAS .349 VOLTS.
IF Q6 HAD FAULT MODE EB/EC/BC = OP/OK/OP IT WOULD BE .712 VOLTS.
IN A WORKING CIRCUIT IT IS .593 VOLTS.

The system then points out a more subtle counter-example.

THAT FAULT DOES NOT EXPLAIN ANY OF THE SYMPTOMS YOU OBSERVED BUT IT IS
CONSISTENT WITH THE VALUES FOR
THE BASE EMITTER VOLTAGE OF Q5 AND THE VOLTAGE AT N1.

And finally the system tells him what is right or at least
consistent with his hypothesis.

>>WHAT IS THE OUTPUT OF THE CONSTANT CURRENT SOURCE
THE OUTPUT OF THE CONSTANT CURRENT SOURCE IS -.59 MILLIAMPS.

To understand this request the system must also know about the
functional blocks of the instrument. Also note that the system
had to determine that "output" in this context meant "output
current" as opposed to "output voltage" as was meant in the
first questions of this protocol.

>>WHAT SHOULD IT BE?

IN A WORKING CIRCUIT THE OUTPUT CURRENT OF THE CONSTANT CURRENT SOURCE
IS -.6 MILLIAMPS.

>>V BETWEEN N4 AND N5

THE VOLTAGE BETWEEN N4 AND N5 IS 1.64 VOLTS.

Noun phrase utterances get interpreted as questions.

>>COULD THE PROBLEM BE THAT THE BASE OF Q3 IS OPEN?

IS THE COMPLETE FAULT MODE EB/EC/BC = OP/OK/OP WHAT YOU INTENDED?

> SURE

HMMM, LET ME THINK ABOUT THAT FOR A MINUTE.
UNDER THE SETTINGS:

CC=1.0, CR=HIGH, LOAD=20, VC=1.0, VR=HIGH
THE FAULT WHICH YOU PROPOSED WILL PROPAGATE.
LIKELY FAULTS WHICH MIGHT OCCUR ARE:
R22 OPEN

WOULD YOU LIKE ME TO CONTINUE TESTING YOUR HYPOTHESIS IN A CIRCUIT
IN WHICH R22 IS OPEN ALSO?

> NO 

The student proposes a hypothesis that logically entails
another component being blown. Before further evaluation of
his hypothesis, the user is queried as to whether he realizes
this entailment. If he doesn't, he is given the chance to have
his hypothesis evaluated in a hypothetical or "ghost" world in
which that component would not have blown (e.g., in this case
the power rating of R22 would have to be increased to a value
greater than its power dissipation.)

WOULD YOU LIKE ME TO COMMENT ON YOUR HYPOTHESIS AS IF NO OTHER PARTS
HAD BLOWN?

> 

WHEN THE SETTINGS WERE:

CC=1.0, CR=HIGH, LOAD=20, VC=1.0, VR=HIGH,
THAT FAULT WOULD HAVE CAUSED THE FOLLOWING SYMPTOM.
YOU OBSERVED THAT THE VOLTAGE BETWEEN N4 AND N5 WAS 1.64 VOLTS.
IF Q3 HAD FAULT MODE EB/EC/BC = OP/OK/OP IT WOULD BE 31.3 VOLTS.
IN A WORKING CIRCUIT IT IS 19.9 VOLTS.



ALSO: 

THAT FAULT CONFLICTS WITH THE FOLLOWING MEASUREMENT. 

YOU OBSERVED THAT THE OUTPUT VOLTAGE WAS 11.7 VOLTS.
IF Q3 HAD FAULT MODE EB/EC/BC = OP/OK/OP IT WOULD BE 1.19 VOLTS.
IN A WORKING CIRCUIT IT IS 19.9 VOLTS.

THAT FAULT DOES NOT EXPLAIN ANY OF THE SYMPTOMS YOU OBSERVED BUT IT IS
CONSISTENT WITH THE VALUES FOR

THE OUTPUT CURRENT OF THE CONSTANT CURRENT SOURCE, THE BASE EMITTER
VOLTAGE OF Q5 AND THE VOLTAGE AT N1.

>>SAVE

CONTEXT SAVED, TO RETURN TO IT TYPE 'RESTORE'. 

YOU NOW HAVE A WORKING INSTRUMENT WITH WHICH TO EXPERIMENT. 

The user has discovered more flaws in his understanding and is
probably surprised by the high voltage between N4 and N5 that
follows from his hypothesis (i.e., 31.3 volts). Wishing to
explore what caused R22 to be blown and what led to this
unexpectedly high voltage, he requests that the current
troubleshooting context be saved (i.e., all his measurements,
which are the grist for the evaluation specialist). He can now
perform any experiments he wishes.

>>WHAT IS THE I THRU THE BASE OF Q3 
THE BASE CURRENT OF Q3 IS 47 MILLIAMPS. 

>>WHAT IS THE CURRENT THRU R22 WHEN IT IS SHORTED?
IF R22 WERE SHORTED

UNDER THE PRESENT SETTINGS, THE CURRENT THRU R22 WOULD BE 61 MILLIAMPS.
IN A WORKING CIRCUIT IT IS 1.8 MILLIAMPS.

This is an example of a hypothetical or conditional question.

>>WHAT ARE THE SPECS OF R22 

THE SPECIFICATIONS OF R22 ARE:
VALUE -- 510 OHMS
POWER RATING -- .5 WATTS

He now asks a factual question which he could have just as
well looked up.



>>RESTORE

PREVIOUS SITUATION RESTORED.

He now appears to understand why R22 blew when Q3 opened and
wishes to get back to the task of troubleshooting.

>>WHAT COULD BE WRONG?

HERE ARE SOME FAULTS WHICH EXPLAIN THE MEASUREMENTS YOU OBSERVED:

THE BETA OF Q3 IS LOW

THE BETA OF Q4 IS LOW

OR R22 HAS THE WRONG RESISTANCE

He seems to have run out of ideas and requests some help. At
this point the hypothesis generation "specialists" are invoked.
These procedures examine all the measurements he has thus far
made, and then construct a collection of hypotheses which are
logically consistent with the values of information from these
measurements.

>> WHAT WOULD THE OUTPUT VOLTAGE BE IF THE BETA OF Q3 WERE 8? 
IF THE BETA OF Q3 WERE 8 

UNDER THE PRESENT SETTINGS, THE OUTPUT VOLTAGE WOULD BE 10.5 VOLTS. 
IN A WORKING CIRCUIT IT IS 19.9 VOLTS. 

The user takes a wild guess, probably to get a qualitative feel
about how the beta of Q3 affects things.

>>IS THE EMITTER CURRENT OF Q4 CORRECT

THE EMITTER CURRENT OF Q4 IS -61 MILLIAMPS.
IN A WORKING CIRCUIT THE EMITTER CURRENT OF Q4 IS -49 MILLIAMPS.
THAT VALUE SEEMS TO BE INCORRECT.

This question is in some sense a cheat. It really consists of
two measurements. If a cost were being assigned to his
measurements or questions he would be charged accordingly.

>>COULD R22 BE LOW
IS IT OKAY IF I USE 25 OHMS FOR THE RESISTANCE OF R22?
> YES

THAT FAULT AGREES WITH ALL OF THE MEASUREMENTS YOU HAVE MADE.

Finally he generates a reasonable hypothesis, but this doesn't
mean that R22 is necessarily low, just that R22 being low is
consistent with what he should have been able to determine from
his measurements thus far.

>>WHAT IS THE RESISTANCE OF R22
THE MEASURED RESISTANCE OF R22 IS 25 OHMS.

>>REPLACE R22
R22 REPLACED

NOW YOU SHOULD TAKE MEASUREMENTS TO SEE IF THERE IS ANYTHING
ELSE WRONG.

The system decides he must know what is wrong with R22 so it
doesn't hassle him like it did when he asked for Q6 to be
replaced.

>>NOW WHAT IS THE OUTPUT

THE OUTPUT VOLTAGE IS 19.9 VOLTS.

>>WHAT ABOUT WITH A 10 OHM LOAD
WITH THE LOAD NOW SET TO 10
THE OUTPUT VOLTAGE IS 10 VOLTS.

>>WITH A 1000 OHMS

WITH THE LOAD NOW SET TO 1000

THE OUTPUT VOLTAGE IS 30 VOLTS.

Note the ability of the system to handle rather extensive
ellipses where the context for "expanding" this phrase "with
1000 ohms" is two questions back.

>>WHAT WOULD HAPPEN IF R22 WERE SHORTED?

IF R22 WERE SHORTED

UNDER THE PRESENT SETTINGS:

CC=1.0, CR=HIGH, LOAD=1000, VC=1.0, VR=HIGH
THERE WOULD BE NO CHANGE IN THE OUTPUT VOLTAGE (I.E. 30 VOLTS).

HOWEVER UNDER THE SETTINGS: 

CC=1.0, CR=HIGH, LOAD=10, VC=1.0, VR=LOW, 
THE OUTPUT VOLTAGE WOULD BE .623 VOLTS. 
IN A WORKING CIRCUIT IT IS 9.93 VOLTS. 

A good example of what else is involved in handling
hypothetical questions. Here the system first guesses that he
is interested in output voltage since he didn't specify
anything particular. Then it fails to find any change of the
output voltage with the present load resistance. However,
before it concludes that R22 has no effect it searches some
other knowledge about the circuit and decides that a better
test case or example would be to set the instrument up with a
heavier load. Under this setting it does find a significant
change.

>>IS ANYTHING ELSE WRONG?

NO

There are several points worth emphasizing about the
dialogue. First, the student is not restricted to any
particular order or number of measurements. This means that
SOPHIE must be able to evaluate a student's hypothesis or
generate a collection of theories about what is wrong
relative to any possible set of measurements, control
settings, loads, etc. that the student might decide to use.
In other words, SOPHIE can't use pre-stored decision trees
to help in any of these logical tasks and therefore must
rely on powerful inferencing procedures. Second, the
student in the dialogue is not a beginning electronics
student. SOPHIE assumes that the user has the requisite
electronic knowledge of someone beginning troubleshooting
and is not prepared to answer such questions as "what is a
transistor". Third, although the hypothesis evaluation
specialists refute his hypotheses quantitatively, the
evaluation actually occurs at a qualitative level and only
if inconsistencies are discovered are the exact
quantitative ramifications of his hypotheses presented to
him. In the next section, we will discuss the mechanisms
which allow SOPHIE to carry on dialogues.

SOPHIE manifests most of its "intelligence" through its
question answering and hypothesis evaluation and generation
abilities. These abilities are achieved through a set of
special purpose inferencing procedures each of which
performs a certain class of inferences extremely
efficiently. The centralizing component of the inferencing
system is a simulation program modeling a "piece of
knowledge" which in this case is an electronic instrument.*
The underlying idea of how simulation can be used to perform
inferencing is both straightforward and extremely powerful.
Let us first consider the problem of answering a
hypothetical question (always with respect to a given
circuit) of the form:

"If X then Y?"

where X is a proposition about some component in the given
instrument and Y is a proposition about its behavior or
symptoms. An example of such a hypothesis might be:

"If C2 is shorted, is the output voltage zero?"

The answer to the question can be found by invoking the
simulator: First the simulation model of the instrument
must be modified so that C2 is shorted (i.e., the
proposition X must be made true on the model). Then the
simulation of the modified model is executed. Since the
results of the simulation run contain all the consequences
of the modification (C2 being shorted), the hypothetical
consequent (the output voltage being zero) is simply checked
against these simulation results.

*More precisely, it models a schema of electronic
instruments with one element of the schema being the working
instrument and the other elements representing various ways
the instrument can be faulted.
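
The modify-simulate-check paradigm for "If X then Y?" can be illustrated with a toy example. The sketch below is written in Python for readability and assumes a made-up DC model (a two-resistor divider with a capacitor C2 across the output); the class `Divider`, the function names, and the component values are hypothetical and stand in for SOPHIE's far more detailed circuit simulator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Divider:
    """Toy DC model: supply -> R1 -> output node -> (R2 parallel C2) -> ground."""
    supply_volts: float = 20.0
    r1_ohms: float = 1000.0
    r2_ohms: float = 1000.0
    c2_shorted: bool = False  # a healthy capacitor is an open circuit at DC


def simulate(model):
    """Return the consequences of the model as a dict of named values."""
    lower = 0.0 if model.c2_shorted else model.r2_ohms
    output = model.supply_volts * lower / (model.r1_ohms + lower)
    return {"output voltage": output}


def answer_hypothetical(working, modification, consequent):
    """Answer 'If X then Y?': make X true on the model, simulate, check Y."""
    modified = replace(working, **modification)
    return consequent(simulate(modified))


# "If C2 is shorted, is the output voltage zero?"
answer = answer_hypothetical(
    Divider(),
    {"c2_shorted": True},
    lambda results: abs(results["output voltage"]) < 1e-9,
)
```

Keeping the working model immutable and deriving a modified copy mirrors the idea that the hypothesis is made true on the model without disturbing the reference instrument.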

The above paradigm skips over several logical
difficulties concerning which boundary and/or input
conditions should be used for the simulation runs. If it is
necessary to determine all the logically possible
consequences of a hypothetical modification, then the
simulation must, in principle, be run over a potentially
infinite collection of the instrument's control settings,
etc. While for most practical situations there are only a
finite number of cases "worth" considering, this number can
still be quite large. It is clearly necessary to have an
additional inferencing mechanism which can determine what
the worthwhile cases are for any particular question. This
additional mechanism must embody electronic knowledge of a
different sort than is represented in the simulator. Thus,
metaphorically, the simulator may be interpreted as creating
examples whereas this additional mechanism tries to
guarantee that these examples will be useful.

The tasks that fit most simply into the simulation
paradigm concern requests for measurements. It is through
this mechanism that SOPHIE can create the electronics
laboratory within which the student is working. Whenever
the student requests a measurement, the simulation is called
to compute the voltage at every node in the circuit. From
these voltages procedural specialists derive answers to
additional questions about the current through any
component, the resistance of a component, the power
dissipation of a component, the beta of a transistor, etc.
Whenever the student wishes to explore the circuit under
different conditions, he can arbitrarily change the
controls, modify any component, or introduce his own faults.
Any such changes get efficiently translated by other
procedural specialists into new simulation models.
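
As a rough illustration of how further quantities can be derived once the node voltages are known, consider the Python sketch below. The node names, component table, and values are invented for the example; only Ohm's law and the power relation are taken as given, and nothing here reflects the actual structure of SOPHIE's procedural specialists.

```python
# Hypothetical node voltages (volts) as a simulator might report them.
node_volts = {"N1": 12.0, "N2": 7.0, "GND": 0.0}

# Hypothetical components: name -> (node_a, node_b, resistance in ohms).
resistors = {"R1": ("N1", "N2", 500.0), "R2": ("N2", "GND", 700.0)}


def voltage_across(name):
    """Voltage drop across a resistor, from its two node voltages."""
    node_a, node_b, _ = resistors[name]
    return node_volts[node_a] - node_volts[node_b]


def current_through(name):
    """Ohm's law: I = V / R, in amperes."""
    return voltage_across(name) / resistors[name][2]


def power_dissipated(name):
    """P = V * I, in watts."""
    return voltage_across(name) * current_through(name)
```

For instance, `current_through("R1")` evaluates to about 0.01 amperes here, since 5 volts appear across 500 ohms.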

Hypothesis Evaluation

The first sophisticated use of simulation concerns the
task of hypothesis evaluation. Remember that hypothesis
evaluation requires a technique that can check the logical
consistency of a hypothesis against the measurements the
student has taken. For example, hypothesis evaluation is
required when a student, after making several measurements,
develops an idea (hypothesis) about what is wrong, e.g., "Is
it possible that resistor R9 is open?".

To evaluate the given hypothesis we must derive all of
its logical consequences and see if any of these
consequences conflict with information derivable from his
measurements. If there are such conflicts, they must be
pointed out to the student as logical inconsistencies. In
addition, evaluation should identify which of his
measurements directly support his hypotheses and which are
independent of it.

The evaluation strategy makes extensive use of
simulation in the following way: First, the simulation
model is modified so as to be consistent with the given
hypothesis, i.e., the fault hypothesized by the student is
inserted into a working model. Then all the student's
measurements are repeated under this "hypothetical" model.
For each measurement there are four cases that might occur.
(1) The observed and hypothetical values may agree. (2) The
observed value may represent a symptom (i.e., is wrong),
while the hypothetical value is normal (i.e., is correct).
In this case the fault proposed by the student does not
account for this particular symptom. (3) The observed value
may be normal while the hypothetical value is wrong. In
this case the proposed fault would have created symptoms
which the student did not observe. Or (4) the observed
value and the hypothetical value may both be symptomatic but
not the same. In every case but the first, the student must
be told how the measurements disagree. The student's
hypothesis is consistent if all of his measurements fall
into case (1).
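
The four cases can be sketched as a small classification
routine. This is only an illustration under stated assumptions,
not SOPHIE's code: the names Case, close, classify and
is_consistent are hypothetical, and the fixed 5 percent
relative tolerance in close is a crude stand-in for the
metric that the report relies on.

```python
from enum import Enum


class Case(Enum):
    AGREE = 1               # (1) observed and hypothetical values agree
    UNEXPLAINED = 2         # (2) observed is a symptom, hypothetical is normal
    UNOBSERVED = 3          # (3) observed is normal, hypothetical is a symptom
    DIFFERENT_SYMPTOMS = 4  # (4) both are symptoms, but not the same one


def close(a, b, tol=0.05):
    """Assumed stand-in for the metric: relative difference within tol."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1e-12)


def classify(observed, hypothetical, working):
    """Place one measurement into one of the four cases."""
    observed_normal = close(observed, working)
    hypothetical_normal = close(hypothetical, working)
    if observed_normal and hypothetical_normal:
        return Case.AGREE
    if hypothetical_normal:
        return Case.UNEXPLAINED
    if observed_normal:
        return Case.UNOBSERVED
    if close(observed, hypothetical):
        return Case.AGREE
    return Case.DIFFERENT_SYMPTOMS


def is_consistent(measurements):
    """A hypothesis is consistent if every measurement is in case (1)."""
    return all(classify(*m) is Case.AGREE for m in measurements)
```

Each measurement is passed as an (observed, hypothetical,
working) triple; any result other than Case.AGREE is
something the student would have to be told about.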

The comparisons needed to separate the above cases
require knowing not only the values of a measurement in the
hypothetical and malfunctioning circuits but also the value
in a working circuit. The value in a working
circuit is used to determine when the other two values
differ qualitatively. For example, if the value that the
student observed was 25 volts and the value under the
hypothesized fault was 30 volts, the difference between these
two may or may not be qualitatively significant. If the
value of the given measurement in the working circuit is 30
volts, the proposed fault does not account for the lower
voltage observed in the faulted circuit. However, if the
working circuit voltage is 3 volts, the hypothesis is doing
a pretty good job of explaining the behavior implied by this
measurement. Therefore, in addition to using simulation to
determine the above quantities, a metric is involved to
"judge" the qualitative distance between these values. The
heuristic of using the metric to identify when two
measurements significantly differ provides a beautifully
simple circumvention of the need for a "theory" of how and
why the instrument works. (See (Brown et al., 1974) for a
complete specification of this process.)
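
One way such a metric could behave is sketched below. The
actual metric is specified in (Brown et al., 1974) and is not
reproduced here; the function name qualitatively_equal and
the constants fraction and floor are assumptions chosen only
so that the 25/30 volt example above comes out as described.

```python
def qualitatively_equal(observed, hypothetical, working,
                        fraction=0.3, floor=0.02):
    """Return True when two values are qualitatively the same.

    The tolerance grows with the distance of the two values from the
    working-circuit value. fraction and floor are illustrative
    constants, not values taken from SOPHIE.
    """
    deviation = max(abs(observed - working), abs(hypothetical - working))
    tolerance = max(fraction * deviation, floor * abs(working))
    return abs(observed - hypothetical) <= tolerance
```

With a working value of 30 volts, 25 and 30 volts are judged
different; with a working value of 3 volts the same pair is
judged qualitatively equal.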

The values of the student's measurements in a working
circuit are also used to separate those measurements which
support his hypothesis from those that are independent of
it. If the values under all three conditions (i.e., the
working circuit, the faulted circuit, and the
hypothesis-related circuit) are essentially the same, the
information derived from that particular measurement is
independent of his hypothesis. If the faulted and
hypothetical values agree but are different from the working
value, the measurement supports his hypothesis.
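
A minimal sketch of this separation follows. The names
relation_to_hypothesis and close are hypothetical, and the
5 percent relative tolerance is an assumed placeholder for
the metric, not a value from the report.

```python
def close(a, b, tol=0.05):
    """Assumed stand-in for the metric: relative difference within tol."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1e-12)


def relation_to_hypothesis(working, faulted, hypothetical):
    """Say how one measurement bears on the student's hypothesis."""
    if close(faulted, hypothetical):
        # The hypothesis reproduces what the student saw. If that is
        # also what a working circuit shows, the measurement says
        # nothing about the hypothesis.
        if close(faulted, working):
            return "independent"
        return "supports"
    return "conflicts"
```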

Simulation Models 

As we have seen, DC simulations form part of the basis
of SOPHIE's understanding of electronics. We currently use
two types of simulators. The first is a general purpose
circuit simulation package coded in FORTRAN, called SPICE
(Nagel 1971, 1973). The other is a functional simulator
written in LISP, which incorporates circuit dependent
knowledge.

There are many problems unique to our use of
simulation. For example, methods of modeling a circuit
which facilitate the insertion of faults had to be developed
along with explicit models of faulty components. In
addition, the faulting of one component will very often
overload one or more other components, leading to fault
propagation. Such situations require a special monitoring
mechanism which "sits on top" of the general purpose
simulator and looks at the results of each simulation to
decide if and how additional parts would blow. In fact,
this mechanism, by making successive calls to the simulator,
grows a fault propagation tree which captures the causal
chain of events of several parts being blown by one initial
fault. This tree also serves as a data base for some of the
question answering routines.

CHAPTER 2 HYPOTHESIS GENERATION



Whenever the student requests help from SOPHIE, the
hypothesis generation system is invoked. This system
provides him with a list of possible faults which follow
from or are logically consistent with the measurements he
has thus far observed. The technique used is based on a
brute force method which has been made more intelligent
through the addition of several procedural specialists. We
will first describe the brute force method and then describe
how we have circumvented many of its limitations.

This brute force technique begins with a predetermined
list of all the possible faults which may physically occur
in the instrument. Then, for each measurement the student
has taken, each fault on the list is run through the
simulator. If the simulated value for each measurement
under a particular fault agrees with the observed value,
then this fault is consistent with what the student has
seen. Otherwise this fault violates at least one
measurement the student has taken and is therefore removed
from the list of "viable faults."
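
The brute force method can be sketched as a simple filter.
Everything named here is hypothetical: viable_faults,
toy_simulate and toy_agrees are illustrative names, and the
table of values is made up to stand in for a real simulator.

```python
def viable_faults(all_faults, measurements, simulate, agrees):
    """Keep the faults whose simulated value matches every observation.

    measurements: list of (quantity, observed_value) pairs.
    simulate(fault, quantity): value of quantity with that fault inserted.
    agrees(simulated, observed): True if the two values match.
    """
    viable = []
    for fault in all_faults:
        if all(agrees(simulate(fault, quantity), observed)
               for quantity, observed in measurements):
            viable.append(fault)
    return viable


# Toy stand-in for the simulator: a lookup table of made-up values.
TABLE = {
    ("R9 open", "output voltage"): 1.2,
    ("R9 open", "voltage at N4"): 2.5,
    ("Q3 open", "output voltage"): 0.0,
    ("Q3 open", "voltage at N4"): 2.5,
    ("R16 open", "output voltage"): 1.2,
    ("R16 open", "voltage at N4"): 9.0,
}


def toy_simulate(fault, quantity):
    return TABLE[(fault, quantity)]


def toy_agrees(simulated, observed):
    return abs(simulated - observed) <= 0.1
```

The cost is one simulator call per fault per measurement,
which is why the speed of the simulator dominates.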

This technique has several obvious limitations. The
primary limitation is that of speed. With the large number
of faults in the IP-28 and with a slow simulator like SPICE,
the time required to simulate all physically possible faults
would be excessive. A more theoretical limitation is that
some components of the circuit have an infinite number of
fault modes and hence might in principle require an
unbounded number of simulation runs. Two examples of this
are a faulted resistor whose value has changed or a faulted
transistor whose beta has changed.

The most interesting way that the speed limitation is
circumvented is by creating a specialist called the
proposer. The proposer first looks at the result of one
measurement and then uses that result to deduce a set of
faults. This set of faults is smaller than the entire set
of all possible faults for the instrument but still contains
all the faults which are consistent with that one
measurement. In addition to this technique we have
constructed a special purpose functional simulator (for the
IP-28) which runs between 10 and 100 times faster than the
general purpose simulator, but which is less accurate than
the general purpose one.



The limitation encountered with those fault modes which
have an infinite set of possible values is circumvented by
creating another specialist called the instantiator. For
each component having such a fault mode, the instantiator
uses the observed output voltage to determine the faulted
value of the component. It obviously encodes a lot of
special knowledge about the circuit to be able to carry out
this task.


Both of these specialists work in conjunction with the
third specialist called the refiner. This specialist uses
the fast simulator (not SPICE) with routines which compare
the observed faulted value with the value produced by the
proposed fault. If these two values do not agree, then this
fault is removed from the list of possible faults. It
therefore refines the list of faults which have been
proposed by the proposer.

Interaction of the Three Specialists

Let's take the following example. Assume that the
student has taken only one measurement, the output voltage,
and has found it to be low. At this point he asks for help.
The proposer first examines the value observed for the
output voltage and from that deduces a list of possible
faults which could explain that measurement. It performs
this task by using a set of procedures which encode
sufficient circuit-dependent knowledge to be able to use
only the output voltage and the settings, and from this
information proposes a list of all the possible faults which
are consistent with the observed output voltage. However,
this list may also include some spurious faults which do not
cause the observed behavior. The instantiator now goes to
work. On those faults which require a value (an example in
this case would be the beta of Q3 being low), the
instantiator determines what this value should be. The
refiner then runs each proposed fault through the fast
simulator to confirm that the value of the output voltage
that the simulator obtained is the same as that which the
student observed. This refinement specialist thereby rules
out those faults which were spurious or over-general.
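
The division of labor among the three specialists can be
sketched as a single loop. This is an assumed outline, not
the LISP implementation: generate_hypotheses and its five
function arguments are hypothetical names, and each
specialist is passed in as an ordinary function.

```python
def generate_hypotheses(observed, propose, needs_value, instantiate,
                        fast_simulate, agrees):
    """Proposer, then instantiator, then refiner, for one observation."""
    hypotheses = []
    for fault in propose(observed):              # proposer: candidate faults
        value = None
        if needs_value(fault):                   # instantiator: pick a value
            value = instantiate(fault, observed)
            if value is None:
                continue                         # no value fits the observation
        simulated = fast_simulate(fault, value)  # refiner: re-simulate
        if agrees(simulated, observed):
            hypotheses.append((fault, value))
    return hypotheses
```

The proposer is allowed to over-generate because every
candidate still has to survive the refiner's simulation.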

The Proposer

The proposer is used to generate a set of probable
faults. This specialist must propose every fault which can
explain the behavior observed by the student. It also can,
and does, propose faults which do not produce the behavior
the student has seen. These are not excluded because the
proposer does not take into account every nuance of the
circuit. To take every nuance into account would require
rules which are too complex and, in addition, these complex
rules may not even be able to be determined. The rules used
by the proposer are reasonably simple and fast. For
example, when the output voltage is essentially zero volts
there is one group of possible faults including Q3 being
open, R16 being open and C5 being shorted. When the output
voltage is 0.6 volts less than what it should be, the fault
D$ being shorted is generated. There are approximately a
dozen such rules for the IP-28 instrument.

The proposer uses only the value of the output voltage.
Additional proposers could be written; however, one
proposer would be required for each possible measurement in
the circuit. Since good troubleshooting techniques require
the student to take external measurements before internal
measurements, our single proposer works quite well. Of
course this means that the hypothesis generation facility
cannot be used until an external measurement has been taken.

When the proposer is presented with an output voltage 
that is not symptomatic, i.e., the same as that in the 
working circuit, it can still generate possible faults. 
However, this list can be quite large. Certain faults act
normally under most settings. For example, unless the load
is sufficiently low, the fault of Q6 being open would not be
detectable. Therefore, this fault cannot be ruled out until
some symptomatic behavior is seen or until the settings have
been changed such that a symptom should have been seen and
wasn't.

The student could ask for the output voltage at several
settings before asking for help. The proposer then makes a
list of possible faults for each setting. Only faults which
are on all the lists are then considered.
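
Combining the per-setting lists is a plain intersection. A
minimal sketch, with the hypothetical name common_faults:

```python
def common_faults(fault_lists):
    """Intersect the per-setting fault lists, keeping first-list order."""
    if not fault_lists:
        return []
    remaining = [set(faults) for faults in fault_lists[1:]]
    return [f for f in fault_lists[0] if all(f in s for s in remaining)]
```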

The Instantiator

There is a class of faults (suggested by the proposer) 
which have left unspecified a specific value for the fault. 
An example is for a transistor to have a changed beta.
Here, the proposer claims the value of beta has changed but
has not specified what the changed value is. However,
before that fault can be simulated a specific value must be
chosen. It is the job of the instantiator to determine such
values.

The instantiator uses two techniques to determine the 
value for such faults. One technique is to directly 
calculate the value. An example is that of Q3 having
incorrect beta. The instantiator proceeds by dividing the
output current by the amount of current coming out of the
Constant Current Source to determine the current beta of the
Darlington section. This is then divided by the beta of Q4
to determine the beta of Q3. This works because the values
for the Constant Current Source and the beta of Q4 are
assumed to be correct since only one component is faulted in
the circuit.
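
The direct calculation amounts to two divisions. The sketch
below uses the hypothetical name instantiate_q3_beta and
assumes all three inputs are already known as plain numbers
in consistent units.

```python
def instantiate_q3_beta(output_current, source_current, q4_beta):
    """Infer the beta of Q3 from the observed output current.

    source_current is the current out of the Constant Current Source and
    q4_beta is the nominal beta of Q4; both are assumed correct because
    only one component is taken to be faulted.
    """
    darlington_beta = output_current / source_current
    return darlington_beta / q4_beta
```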

The second technique involves approximation using
"forward" functions. In certain cases one cannot determine
the "inverse" function that translates the output voltage
into the component value as was done for the beta of Q3
above. One can only determine the output voltage from the
value of the component. The maximum and minimum values of
the component are tried using the appropriate forward
function. If the output value is not inside the range
generated by the function, then it is not a possible fault.
If it is inside the range the value at the midpoint is tried
next. This binary search continues until the correct input
value which causes the output value is found (within some
percentage). This requires that the forward function being
used is monotonic. In actuality, the forward functions only
need to calculate the value at the outside of their
functional block. Inverse functions are used to determine
the functional block's value from the output voltage.
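
The binary search over a monotonic forward function can be
sketched as follows. The name instantiate_by_bisection, the
1 percent default tolerance and the iteration cap are
assumptions made for the example.

```python
def instantiate_by_bisection(forward, target, lo, hi,
                             rel_tol=0.01, max_iter=60):
    """Find the component value whose forward(value) matches target.

    forward must be monotonic on [lo, hi]. Returns None when target lies
    outside the range spanned by forward(lo) and forward(hi), i.e. when
    the fault cannot explain the observation.
    """
    f_lo, f_hi = forward(lo), forward(hi)
    if not (min(f_lo, f_hi) <= target <= max(f_lo, f_hi)):
        return None
    increasing = f_lo <= f_hi
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = forward(mid)
        if abs(f_mid - target) <= rel_tol * max(abs(target), 1e-12):
            return mid
        # Keep the half of the interval that still brackets the target.
        if (f_mid < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Monotonicity is what guarantees that the discarded half
cannot contain the answer.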

The above description assumes that an incorrect or
symptomatic output voltage had been used. When dealing with
a correct or non-symptomatic output voltage, less
information about what the value should be is known. Only
the upper bound on the value for the functional block can be
determined. An example is that of Q3 having insufficient
beta. With a sufficiently large load resistor, the output
voltage will be correct. There will be a point, however,
where the beta will be so low that incorrect output voltage
would have resulted. Therefore, if the transistor Q3 is
faulted in that manner and the output voltage is correct,
the beta must be above that cut-off point.


If a component has been instantiated more than once,
then it is possible that that fault should be removed from
consideration. If both instantiations were based on
symptomatic output voltages, then two exact values
result for the component. If these two values disagree then
the fault is ruled out.

The second case is when the instantiations were based,
one on a correct output voltage and the other on an errored
one. If the exact value does not fall in the range produced
by the upper bound, then the fault is ruled out. If both
instantiations were based on correct output voltages, then
two upper bounds resulted. The lower of those two upper
bounds is taken as the new upper bound.
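
The three ways of combining two instantiations can be
sketched as below. The representation is an assumption made
for the example: each instantiation is a pair whose first
element is "exact" or "upper", following the text's
description in terms of upper bounds, and combine is a
hypothetical name.

```python
def combine(first, second, agree):
    """Combine two instantiations of the same fault.

    Each instantiation is ("exact", value) when it came from a
    symptomatic output voltage, or ("upper", bound) when it came from a
    correct one. Returns the combined instantiation, or None if the
    fault is ruled out.
    """
    kind_a, a = first
    kind_b, b = second
    if kind_a == "exact" and kind_b == "exact":
        return first if agree(a, b) else None
    if kind_a == "upper" and kind_b == "upper":
        return ("upper", min(a, b))
    # One exact value and one bound: the exact value must respect the bound.
    exact, bound = (a, b) if kind_a == "exact" else (b, a)
    return ("exact", exact) if exact <= bound else None
```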

The Refiner

The job of the refiner is to eliminate the spurious
faults suggested by the proposer. The refiner is not
limited to looking only at output voltages. It allows the
comparison of any measurement that the fast simulator can
generate. It takes a fault from those suggested by the
proposer and a measurement (with a particular setting) that
the student had previously taken and runs it through the
fast simulator. It then compares the value from the
simulator with the value the student observed. If the
metric determines that the two values are not equivalent,
then the fault is removed from further consideration.

The metric used for the comparison is the same as that
used in hypothesis evaluation. It is described in detail in
(Brown et al., 1974). This metric uses three values: the
value the student observed, the value produced by the
simulator for the fault being explored, and the value in a
working circuit. The tolerance used to compare the
student's value and the simulator's value is proportional to
the difference between these two values and the working
circuit's value.

The value the student observed is re-determined by the
fast simulator rather than SPICE. (The other two values
were determined by the fast simulator previously.) This
would be unnecessary except that the fast simulator is not
as accurate as SPICE and a comparison between these three
values is more meaningful when determined by the same
simulator.

There is one case where the SPICE value must be used 
for the student's observed value. This is when the fast 
simulator cannot simulate the fault that is in the circuit.
This occurs when there is a multiple component fault in the
circuit. The hypothesis generation system, unable to
generate multiple faults, will then generate those
non-multiple faults which mimic the behavior the student
observed. The SPICE simulator value must of necessity be
used since the fast simulator cannot handle multiple faults.

The fast simulator described in Brown et al., 1974 only 
found the values at the outside of the functional blocks
(e.g., the constant current source). This simulator has now
been augmented by adding a specialist for each functional
block which can determine values internal to its functional
block. When a measurement is taken inside a particular
functional block, its specialist examines the result of the
fast simulation and determines the values of the
measurements internal to that functional block. Not all the
specialists are invoked for each simulation run. Only those
needed are invoked.

Measurement Verification

After a student makes a measurement, we would also like to
tell him whether or not it was a reasonable measurement,
that is, whether it eliminates one or more possible faults.

Before the measurement is determined, the system
internally calls the hypothesis generation system to
determine the list of possible faults at that point in time.
The measurement is then determined. Again the hypothesis
generation system is called internally and a second list of
possible faults is obtained which now takes into
consideration all the old measurements plus this new one.
These two lists are compared. If they are the same, then
the student's measurement did not narrow down the list of
possible faults and could then be considered unreasonable.
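
The before-and-after comparison can be sketched in a few
lines. The name verify_measurement is hypothetical, and
generate_hypotheses stands for whatever routine maps a list
of measurements to a list of possible faults.

```python
def verify_measurement(earlier, new, generate_hypotheses):
    """Return (reasonable, eliminated) for a newly taken measurement.

    earlier: list of measurements taken so far.
    new: the measurement just taken.
    generate_hypotheses(measurements): list of faults still possible.
    """
    before = set(generate_hypotheses(earlier))
    after = set(generate_hypotheses(earlier + [new]))
    eliminated = before - after
    # The measurement is reasonable only if the list actually changed.
    return before != after, eliminated
```

The set of eliminated faults is returned alongside the
verdict so that a caller can report which faults were ruled
out instead of only reporting that nothing was.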

In the current implementation we type out the message 
that the measurement was useless. If there was only one 
fault remaining on the list of possibilities before he took 
the measurement, the student is told that he has enough
information to uniquely determine the fault. The system
then refrains from printing any more "useless measurement"
messages to keep from further confusing the obviously
confused student.

Another implementation being considered is to take a
more positive approach and to print out a list of faults
that have been removed from the possibility list after he
takes a measurement. Unlike the first approach, which gave
him only negative feedback, that is, that he did something
wrong, this approach may prove less discouraging.

The information as to the faults that have been ruled
out by his measurement is valuable to the student. If the
first approach is used, we intend to allow the student to
obtain this information by asking for it.

CHAPTER 3 A SEMANTICALLY CENTERED PARSING SYSTEM



When we started building SOPHIE's natural language
processor we knew that it had to be both efficient and
understanding of informal speech, i.e. "friendly". If
either of these criteria were not met, letting the student
use natural language would be an obstacle rather than an aid
to the instruction process. If the natural language
processor were not efficient, the student would lose
interest while waiting for his question to parse. If it
were not friendly, he would get frustrated trying to find a
way to express his ideas.

While the problems involved in building an efficient 
system are well-known, those involved in a truly friendly
system are not. We quickly discovered that when students
use a system which exhibits "intelligence" in its deductive
and inferencing capabilities, they start to assume that the
system must also be intelligent in its conversational
abilities. For example they would frequently delete
parts of their statements which they felt would be obvious
given the context of the preceding statements. This
included the use of such linguistic phenomena as
pronominalizations, anaphoric deletions and ellipsis. This
led us to concentrate during the last six months on the
semantic problems of dialogue.

Annotated Dialogue

Before we discuss how SOPHIE's natural language
processor works, we present an annotated dialogue which is
intended to demonstrate SOPHIE's new linguistic
capabilities.* Even though the dialogue necessarily deals
with electronic jargon, the linguistic issues it exemplifies
occur in all domains. The annotations (lower case,
indented) attempt to point out these problems and should be
understandable to the non-electronics oriented reader. In
the dialogue the student's typing is underlined.



*The reader is encouraged to see (Brown, 1974) for further
examples and descriptions of SOPHIE's tutorial and
inferential capabilities.

>>INSERT A FAULT

     The student tells SOPHIE to give him a fault which he can
     troubleshoot. SOPHIE randomly selects a fault, inserts it into a
     model of the instrument and tells the student how the controls
     are presently set.

THE INSTRUMENT HAS A FAULT AND ITS PRESENT CONTROL SETTINGS ARE:
CC    1.0   (CURRENT CONTROL-FROM 0.0 (MIN CURRENT) TO 1.0)
CR    HIGH  (CURRENT RANGE SWITCH, HIGH=1 AMP, LOW=.1 AMP)
LOAD  1000  (LOAD RESISTANCE IN OHMS)
VC    1.0   (VOLTAGE CONTROL-FROM 0.0 (MIN VOLTAGE) TO 1.0)
VR    HIGH  (VOLTAGE RANGE SWITCH, HIGH=30, LOW=10 VOLTS)

>>WHAT IS THE OUTPT?
=OUTPUT
THE OUTPUT VOLTAGE IS 1.2 VOLTS.

     SOPHIE corrects the student's typing mistake.
     Notice that "the output" was clarified as being the output
     voltage.

IF YOU ARE NOT SURE WHETHER A VALUE IS CORRECT, YOU CAN ASK 'IS THAT
VALUE CORRECT?'

     "Correct" here means what it is in a working instrument.
>>IS THAT RIGHT 

IN A WORKING CIRCUIT THE OUTPUT VOLTAGE IS 30.1 VOLTS.
THAT VALUE IS INCORRECT.

     "That" here refers to the value of the previous measurement.
     SOPHIE, in addition to the faulted instrument, has a working
     circuit which can be used for reference.

>>WHAT IS THE VOLTAGE BETWEEN THE CURRENT LIMITING TRANSISTOR
...AND THE CONSTANT CURRENT SOURCE?
THE VOLTAGE AT N4 IS 2.48 VOLTS.

     In addition to specific part names, SOPHIE recognizes references
     to parts by their section. Conjunction is only handled in
     limited cases. Notice that even though the sentence says
     "voltage between x and y", it really means "voltage at z with
     respect to ground". The output routines reduce the input to its
     simpler form (i.e. between x and y reduces to N4).



>>WHAT IS IT BETWEEN THERE AND THE BASE OF Q6 
THE VOLTAGE BETWEEN N4 AND N5 IS 1.28 VOLTS.

     Here "it" refers to "the voltage" and "there" refers to N4.
     Also notice that "between" has to be treated differently,
     i.e., two nodes are explicit now.

>>Q5? 

THE VOLTAGE BETWEEN N4 AND N10 IS 1.84 VOLTS.

     An ellipsis. The student in effect meant "what is the voltage
     between N4 and the base of Q5?"

>>WHAT IS THE BASE CURRENT OF Q3?
THE BASE CURRENT OF Q3 IS 12 MICROAMPS.

     SOPHIE can answer questions about current, resistance, and power
     dissipation as well as voltage. The output and input routines
     know about unit prefixes.

>>THRU THE EMITTER?
THE EMITTER CURRENT OF Q3 IS -.26 MILLIAMPS.

     Another ellipsis. Notice that the referents cannot be figured
     out on syntactic grounds and yet this is the sort of statement
     a student might make if the last answer had caused a slight
     shifting of his thinking.

>> COULD THE PROBLEM BE THAT Q5 IS BAD? 

PLEASE BE MORE SPECIFIC ABOUT HOW Q5 IS FAULTED.

     The student develops an idea about what is wrong and asks to have
     it evaluated.



>> COULD IT BE SHORTED? 

THE SHORTED MODES OF A TRANSISTOR I HANDLE ARE: ALL JUNCTIONS
SHORTED OR ONE JUNCTION SHORTED WITH THE OPPOSITE TERMINAL
MODELLED OPEN.

     A hypothesis must be completely specified (or non-ambiguous)
     before evaluation can take place.

     Even though the last statement generated an error condition, it
     is still used to determine possible pronoun referents.

SHOULD Q5 HAVE FAULT MODE EB/EC/BC = SH/SH/SH?
> NO
WHICH TERMINAL SHOULD BE MODELLED OPEN?
> THE BASE
OK, I'LL USE EB/EC/BC = OP/SH/OP

     At this point, SOPHIE looks back over the student's measurements
     to determine if he should have "known better". See (Brown, 1974)
     for further description.

HMMM, LET ME THINK ABOUT THAT FOR A MINUTE.

WHEN THE SETTINGS WERE:
CC=1.0, CR=HIGH, LOAD=1000, VC=1.0, VR=HIGH,
THAT FAULT CONFLICTS WITH THE FOLLOWING MEASUREMENT.
YOU OBSERVED THAT THE VOLTAGE BETWEEN N4 AND N5 WAS 1.28 VOLTS.
IF Q5 HAD FAULT MODE EB/EC/BC = OP/SH/OP IT WOULD BE 1.1 MILLIVOLTS.
IN A WORKING CIRCUIT IT IS 1.43 VOLTS.

BUT THAT FAULT DOES EXPLAIN THE VALUES YOU OBSERVED FOR
THE OUTPUT VOLTAGE, THE VOLTAGE AT N4, THE VOLTAGE BETWEEN N4 AND N10,
THE EMITTER CURRENT OF Q3 AND THE BASE CURRENT OF Q3.



>>WHAT COULD BE WRONG 

THERE ARE LOTS OF FAULTS WHICH WOULD EXPLAIN THE MEASUREMENTS YOU
HAVE TAKEN. HERE ARE SOME POSSIBILITIES:
THE RESISTANCE OF R8 CHANGES
THE LEAKAGE RESISTANCE OF C4 CHANGES
THE BETA OF Q3 CHANGES
R15 IS OPEN
R15 IS SHORTED
Q6 IS FAULTED

     The student becomes stuck and asks for assistance. SOPHIE
     proposes some possible hypotheses for him to explore.

>>IF R8 WERE 30K, WHAT WOULD THE OUTPUT VOLTAGE BE?
IF THE VALUE OF R8 WERE 30000
UNDER THE PRESENT SETTINGS, THE OUTPUT VOLTAGE WOULD BE .904 VOLTS.
IN A WORKING CIRCUIT IT IS 30.1 VOLTS.

     The student doesn't believe one of the possibilities so checks it
     using a hypothetical statement. Even though the voltage in this
     case and that in the faulted circuit are not the same, they
     qualitatively agree.



>> VOLTAGE DROP ACROSS R5 

THE VOLTAGE ACROSS R5 IS -.028 MILLIVOLTS.

     Noun phrase utterances are treated as questions.

>>WHAT ABOUT R3?

THE VOLTAGE ACROSS R3 IS 71.6 VOLTS.

     Context of previous question is used to determine "what" about
     R3.

>>REPLACE R3

I AM GOING TO ASK YOU SOME QUESTIONS ABOUT HOW R3 IS FAULTED.
IS R3 OPEN OR SHORTED OR DOES IT HAVE THE WRONG VALUE?
> OPEN
R3 REPLACED

>>IS THERE ANYTHING ELSE WRONG
NO 

     After fixing the fault, the student uses SOPHIE to explore other
     aspects of the circuit's behavior.

>>IF C2 WERE LEAKY, WHAT WOULD THE VOLTAGE ACROSS IT BE?
IS IT OKAY IF I USE 10000 FOR THE LEAKAGE RESISTANCE OF C2?
> YES
IF THE LEAKAGE RESISTANCE OF C2 WERE 10000
UNDER THE PRESENT SETTINGS, THE VOLTAGE ACROSS C2 WOULD BE 26.3 VOLTS.
IN A WORKING CIRCUIT IT IS 28.9 VOLTS.

>>WHAT HAPPENS IF C1 SHORTS?
UNDER THE SETTINGS: 

CC=1.0, CR=HIGH, LOAD=1000, VC=1.0, VR=HIGH,
IF C1 WERE SHORTED, D3 WOULD BECOME OVERLOADED.

     In the case where nothing more interesting happens, "what
     happens" questions default to the output voltage.

>>WHAT IS THE OUTPUT VOLTAGE WHEN THE VOLTAGE CONTROL IS SET TO .5?
WITH THE VOLTAGE CONTROL NOW SET AT .5
THE OUTPUT VOLTAGE IS 15.6 VOLTS.

     A hypothetical question can involve the changing of a control.

>>IF IT IS SET TO .7?

WITH THE VOLTAGE CONTROL NOW SET AT .7 
THE OUTPUT VOLTAGE IS 21.4 VOLTS. 

     Context determines which measurement should be taken and which
     control should be set.

>>.9?

WITH THE VOLTAGE CONTROL NOW SET AT .9
THE OUTPUT VOLTAGE IS 27.2 VOLTS.

>>WHAT IS THE CURRENT THRU THE CC WHEN THE VC IS SET TO 1.0?

WITH THE VOLTAGE CONTROL NOW SET AT 1.0
THE CURRENT THRU THE CURRENT CONTROL IS 30 MILLIAMPS.

     CC and VC are acceptable abbreviations.

>>WHAT IS IT WITH IT SET TO .8?

WITH THE VOLTAGE CONTROL NOW SET AT .8
THE CURRENT THRU THE CURRENT CONTROL IS 24 MILLIAMPS.

     Here it must correctly instantiate two occurrences of "it".

>>GOODBYE 
GOODBYE 



Use of Semantic Information During Parsing 

In order to bring semantic knowledge* to bear on the 
problems of understanding a student's statement, the grammar 
from which the understanding program works is structured
around semantic or conceptual categories rather than
syntactic ones. This means that SOPHIE's parser, which
operates in a top-down left-to-right manner, is working with
conceptual entities. Before we discuss the actual
mechanisms by which parsing occurs or the exact form of the
"semantic" grammar, we will present some examples of how
information is used during parsing.

Semantic Prediction in the Grammar: 

One way that semantic information is used is to predict 
the possible alternatives that must be checked at a given
point. Consider for example the operation of the parser
when given the phrase "the voltage at xxx" (e.g., "the
voltage at the junction of the current limiting section and
the voltage reference source"). When it discovers the word
"at" in its top-down, left-to-right parse, it uses semantic
information associated with the concept "measurement" to
predict the nature of "xxx", i.e., it should be a noun
phrase specifying a location (node) in the circuit. If
"xxx" is not a location, the parser can give up without
trying to treat "xxx" as any other possible noun phrase.
This is similar to the effect of using semantic markers.
However, a system using semantic markers would parse "xxx"
as a noun phrase and then check for the marker +LOCATION.
This would mean that marker checking could only be done
after the work had been done to parse the noun phrase.

*Semantic information refers to all knowledge. No
distinction is made between semantic, conceptual or
pragmatic knowledge.

This same predictive information is also used to aid in 
the determination of referents for pronouns. If the above
phrase were "the voltage at it", the grammar is able to
restrict the class of possible referents to locations. By
taking advantage of the available sentence context to
restrict the class of possible referents, the simple rule of
"last mentioned acceptable object" works in a large number
of cases. Consider the sequence:

(1a) Set the voltage control to .8?
(1b) What is the current thru R9?
(1c) What is it with it set to .9?

In (1c), the grammar is able to recognize that the first
"it" refers to a measurement that the student would like
re-taken under slightly different conditions. The grammar
can also determine that the second "it" refers either to a
potentiometer or to the load resistance (i.e., one of those
things which can be set). The referent for the first "it"
is the measurement taken in (1b). The referent for the
second "it" is "the voltage control" which is an instance of
a potentiometer. The mechanism which selects the referents
will be discussed later.
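
The "last mentioned acceptable object" rule can be sketched
as a backward scan over the things mentioned so far. This is
not the selection mechanism SOPHIE uses, which the report
discusses separately; resolve, the history list and the
category labels are all hypothetical.

```python
def resolve(history, acceptable):
    """Return the most recently mentioned entity that passes acceptable."""
    for entity in reversed(history):
        if acceptable(entity):
            return entity
    return None


# Entities mentioned in (1a) and (1b), oldest first: (phrase, category).
history = [
    ("the voltage control", "potentiometer"),
    ("the current thru R9", "measurement"),
    ("R9", "resistor"),
]


def is_measurement(entity):
    return entity[1] == "measurement"


def is_settable(entity):
    # Things that can be set: a potentiometer or the load resistance.
    return entity[1] in ("potentiometer", "load")
```

Calling resolve(history, is_measurement) gives the referent
of the first "it" in (1c), and resolve(history, is_settable)
gives the referent of the second.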



Using Local Information For Simple Omission:

Another capability of predictive semantic knowledge 
during parsing is the recognition of simple omissions. The 
parser knows, for each conceptual entity, the nature of its 
constituent concepts. When it is looking for an occurrence
of a conceptual entity and cannot find an occurrence of a
constituent concept, it can either fail (if the missing
concept is considered to be obligatory in the surface
structure) or hypothesize that an omission has occurred and
continue.

For example, the concept of a TERMINAL has (as one of
its realizations) the constituent concepts of a
TERMINAL-TYPE followed by a PART. When the parser is
looking for the concept of a TERMINAL and only finds the
phrase "the collector", it uses this information to posit
that a part has been omitted (i.e., TERMINAL-TYPE gets
instantiated to "the collector" but nothing gets
instantiated to PART). In addition, it uses the
dependencies between the constituent concepts to conclude
that the omitted PART is a TRANSISTOR.

Since the parser is able to recognize things that have
been left out, it can also sometimes fill in the missing
piece. In the statement "change R9 to 1000", the parser
notes that the units to be changed have been omitted and
decides that the student meant "ohms" instead of "watts".
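
The treatment of an omitted PART can be illustrated with a
toy recognizer for the TERMINAL concept. It is a sketch
only: SOPHIE's grammar is not a word-list matcher, and
TERMINAL_TYPES and parse_terminal are hypothetical names.

```python
# Each terminal type implies the kind of part it can belong to.
TERMINAL_TYPES = {
    "base": "TRANSISTOR",
    "collector": "TRANSISTOR",
    "emitter": "TRANSISTOR",
}


def parse_terminal(phrase):
    """Recognize TERMINAL-TYPE followed by an optional PART.

    Returns None if the phrase is not a terminal at all. When the PART
    is missing, the result records what kind of part was omitted.
    """
    words = [w for w in phrase.lower().split() if w not in ("the", "of")]
    if not words or words[0] not in TERMINAL_TYPES:
        return None
    terminal_type = words[0]
    part = words[1].upper() if len(words) > 1 else None
    return {
        "terminal_type": terminal_type,
        "part": part,
        "omitted_part_kind": None if part else TERMINAL_TYPES[terminal_type],
    }
```

For "the collector" the PART slot stays empty and the
omitted part is recorded as a TRANSISTOR, which is the
information a later stage needs in order to look for a
suitable referent.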

Not all missing information is filled in by the local
rules of the grammar. Given the question "What is the
output?" and using the knowledge that a measurement needs a
quantity and a place to measure it, the parser recognizes
that some information has been omitted. Unlike the previous
example, however, the missing information may be
context-dependent. While, in most cases, the student means
"What is the output voltage of the power supply?", if his
previous question were "What is the input current of the
Darlington Amplifier?", this interpretation is open to
question. For this reason the decision as to the proper
defaults is left to the procedural "specialist" in charge of
calculating the answer to various kinds of measurements.

Recognition of Ellipses:

A third use for predictive semantic knowledge is to be
able to accept elliptic utterances. These are utterances
which do not express complete thoughts* but only give
differences between the underlying thought and an earlier
one. In this sequence, (2b) and (2c) are elliptic
utterances.

(2a) What is the voltage at Node 5?

(2b) At Node 1? 

(2c) What about between nodes 7 and 8? 

The parser is aware of which constructs and which concepts
are frequently used to contrast complete thoughts and
recognizes occurrences of them as ellipses. While the
parser is usually able to determine the intended concepts
from the context available in an elliptic utterance, this is
not always the case. Consider the following two sequences
of statements.



*Note that the notion of a complete thought depends very
much on the domain. For our purposes, a complete thought is
a completely specified question or command.



(3a) What is the voltage at Node 5?
(3b) 10?

(4a) What is the output voltage if the load is 100?
(4b) 10?

In (3b), "10" refers to node 10, while in (4b) it refers to a
load of 10. The problem this presents to the parser is that the
concepts underlying these two elliptic utterances have
nothing in common except the same surface realizations. The
parser, which operates from conceptual entities, does not
have a concept which includes both of these interpretations.
One solution would be to have the parser find all possible
parses (concepts) for a statement and then to choose between
them on the basis of context. Unfortunately, a great deal of
time and effort is spent with such a method.

Capturing Semantic Information in the Parser

To enable the parser to use the semantic constraints,
we have replaced the usual syntactic categories such as
noun, noun phrase, verb phrase, etc., with semantic
categories. These semantic categories represent conceptual
entities known to the system, such as "measurements",
"circuit elements", "transistors", "hypotheses", etc. While
such refinement can lead to a phenomenal proliferation of
non-terminal categories in a grammar, the actual number
involved is directly related to the number of underlying
concepts which can be discussed. For SOPHIE's present
domain, there are approximately fifty such concepts.

The grammar which results from this refinement is a
formal specification of the constraints that exist between
the concepts that are of interest to the parsing process.
The semantic grammar captures the ways of expressing a
concept in terms of constituent concepts. Each rule also
provides explicit information concerning which of its
constituent concepts can be deleted or pronominalized.
Part of the semantic grammar underlying SOPHIE's dialect,
abstracted as a context-free grammar, is provided in
Appendix A. Once the constraints have been formalized into
the semantic grammar, the grammar is encoded as procedures.
In this way, additional information peculiar to the
recognition of a concept can be encoded in the corresponding
rule (procedure). Writing a grammar as a program is not a
new idea, the most notable prior example being Winograd's
blocks world grammar (Winograd, 1973). The use of a
procedural language allows complete freedom to embed
arbitrarily complex information in the form of predicates
and functions. Appendix B gives an example of the
procedural specification of a grammar rule.

Representation of Information in the Grammar

There are two ways information is represented for use
in the grammar. One way is procedurally in the grammar
rules themselves. This is the way most of the predictive
information is encoded. For example, the grammar rule which
corresponds to the concept of measurement is written so that
in parsing the phrase "the voltage xxx", it will only try
the grammar rule for location to parse "xxx".

The second way information is represented is as data in
the semantic network which contains all of the time
invariant data in SOPHIE. (See Brown, 1974 for a complete
description of the semantic net.) For parsing, the semantic
net is used to store information which links words to their
possible concepts. For example, the information that Q4 is
an instance of both the concepts of a TRANSISTOR and a PART
is stored in the net.

An advantage of basing the grammar on conceptual
entities is that it eliminates the need for a separate
semantic interpretation phase (i.e., the syntactic
description stands in a one-to-one relationship with the
semantic description). Since each of the non-terminal
categories in the grammar is based on a semantic unit, each
rule can determine the semantic description of the phrase
that it recognizes in much the same way that a syntactic
grammar determines a syntactic description. The low level
rules build atomic "meanings" which get combined into
functions by the higher level rules.

For example, the meaning of the phrase "Q5" is just Q5.
The meaning of the phrase "the collector of Q5" is
(COLLECTOR Q5) where COLLECTOR is a function encoding the
meaning of "collector". "The voltage at the collector of
it" becomes

     (MEASURE VOLTAGE (COLLECTOR (PREF (TRANSISTOR))))

where MEASURE is the procedural "specialist" who knows about
the concept of a "measurement". The "meaning" of the
pronoun phrase "it" in the example is a call to the function
PREF giving the possible classes of "it". PREF is also
returned as the "meaning" of omitted phrases, i.e., there is
no distinction made between something completely omitted and
something which left "it" behind when omitted. The function
PREF contains information about determining referents on the
basis of context. The relationship between a phrase and its
meaning can be straightforward and, if the concepts and
target semantics are well matched, usually is. However, it
can get complicated. In the phrase "the base of Q5 shorted
to the emitter", the thing that is shorted is the
base-emitter junction. Notice that the parser has some
paraphrasing capabilities, as the "meaning" of the last
phrase would be the same as "the base emitter shorted in
Q5".

The top level "meaning" is a complete program (function
call) which when evaluated calculates an answer to the
student's question or performs the student's command. The
program is also used by the output generation routines to
construct the appropriate phrasing of the response to the
student.
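
To make the idea of a "meaning" as an evaluable program concrete, here is a minimal Python sketch. The nested tuples stand in for the LISP function calls, and the specialists are hypothetical stand-ins that merely describe the requested action instead of consulting a circuit simulator; none of these names or behaviours are taken from SOPHIE's code.

```python
# Hypothetical specialists, keyed by the head of each function call.
SPECIALISTS = {
    "COLLECTOR": lambda part: f"collector of {part}",
    "MEASURE": lambda quantity, place: f"measure {quantity.lower()} at {place}",
}


def evaluate(meaning):
    """Evaluate a nested meaning, innermost calls first, as LISP would."""
    if not isinstance(meaning, tuple):
        return meaning  # atomic meanings such as Q5 evaluate to themselves
    head, *args = meaning
    return SPECIALISTS[head](*[evaluate(arg) for arg in args])


# The meaning of "the voltage at the collector of Q5".
print(evaluate(("MEASURE", "VOLTAGE", ("COLLECTOR", "Q5"))))
```

Because the parse is itself the program, no separate interpretation step sits between recognizing the phrase and acting on it.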

Determining Pronoun Referents

Once the parser has determined the existence of and the
class (or set of classes) of a pronoun or omitted object,
the context mechanism is invoked to determine the proper
referent. This context mechanism has a history of student
interactions during the current session which contains for
each interaction the parse of the student's statement and
the answer calculated by the system. To aid in the search
of the "parses" on the history list, the context mechanism
knows how each of the procedural specialists which can
appear in a parse uses its arguments. For example, the
specialist MEASURE has a first argument which must be a
quantity and a second argument which must be a part, a
junction, a section, a terminal or a node. When the context
mechanism finds a match between a possible class of the
pronoun and one of the argument positions of a specialist in
a previous parse, the object in this argument position is
checked. If this object is a member of one of the allowed
pronoun classes, it is taken as the referent. The
significance of checking the specialist during the search
instead of just using the first object which satisfies the
pronoun's requirements is that it avoids mis-interpretations
due to object-concept ambiguity. For example, the object Q2
is both a part and a transistor. If the context mechanism
is looking for a part, Q2 will be found only in those
sentences in which it is used as a part and not in those in
which it is used as a transistor. In this way the context
mechanism finds the most recent occurrence of an object
being used as a member of one of the desired classes.
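
The search just described might be sketched as follows. This is a hedged Python illustration, not the INTERLISP implementation: the tables `ARG_CLASSES` and `CLASSES`, the function names, and the tuple form of the parses are assumptions made for the example.

```python
# Classes each specialist accepts, by argument position (assumed table).
ARG_CLASSES = {
    "MEASURE": [{"QUANTITY"}, {"PART", "JUNCTION", "SECTION", "TERMINAL", "NODE"}],
    "COLLECTOR": [{"TRANSISTOR"}],
}
# Classes each object belongs to (stands in for the semantic net).
CLASSES = {"Q2": {"PART", "TRANSISTOR"}, "R9": {"PART"}, "VOLTAGE": {"QUANTITY"}}


def _search(parse, wanted):
    """Look inside one parse for an object used as a member of `wanted`."""
    if not isinstance(parse, tuple):
        return None
    head, args = parse[0], parse[1:]
    for arg, slot in zip(args, ARG_CLASSES.get(head, [])):
        # The argument position must accept a wanted class, and the object
        # found there must itself belong to that class.
        if isinstance(arg, str) and CLASSES.get(arg, set()) & slot & wanted:
            return arg
    for arg in args:
        hit = _search(arg, wanted)
        if hit is not None:
            return hit
    return None


def find_referent(wanted, history):
    """Return the most recent object used as one of the `wanted` classes."""
    for parse in reversed(history):  # newest interaction first
        hit = _search(parse, wanted)
        if hit is not None:
            return hit
    return None


history = [("MEASURE", "VOLTAGE", ("COLLECTOR", "Q2")), ("MEASURE", "VOLTAGE", "R9")]
print(find_referent({"TRANSISTOR"}, history))  # Q2, where it was used as a transistor
print(find_referent({"PART"}, history))        # R9, the most recent use as a part
```

Note that a search for a part skips the first parse entirely: Q2 appears there only in a position that takes transistors, which is exactly the object-concept ambiguity the mechanism is meant to avoid.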

Determining Elliptic Referents

If the problem of pronoun resolution is looked upon as
finding a previously mentioned object for a currently
specified use, the problem of ellipsis can be thought of as
finding a previously mentioned use for a currently specified
object. For example:

(5a) What is the base current of Q4? 

(5b) In Q5? 

The given argument is "Q5" and the earlier function is "base
current". This basic approach to ellipsis works
surprisingly well. For a given elliptic phrase, the
semantic grammar identifies the concept (or concepts)
involved. In (5b), this would be TRANSISTOR. The context
mechanism then searches the history list for a function in a
previous parse which accepts the given class as an argument.
When one is found, the new phrase is substituted into the
proper argument position and the substituted meaning is used
as the meaning of the ellipsis.
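
The substitution step can be sketched in Python as below. The representation is assumed for illustration: parses are nested tuples, `ACCEPTS` is a hypothetical table of the classes each function takes in each position, and the meaning chosen for (5a), (MEASURE CURRENT (BASE Q4)), is a guess at what the parser would produce.

```python
# Classes accepted by each function, by argument position (assumed table).
ACCEPTS = {
    "MEASURE": [{"QUANTITY"}, {"TERMINAL", "NODE", "PART"}],
    "BASE": [{"TRANSISTOR"}],
}


def substitute(parse, cls, new):
    """Replace the first argument position accepting `cls` with `new`.

    Returns (new_parse, True) on success and (parse, False) otherwise.
    """
    if not isinstance(parse, tuple):
        return parse, False
    head, args = parse[0], list(parse[1:])
    slots = ACCEPTS.get(head, [])
    for i in range(min(len(args), len(slots))):
        if cls in slots[i]:
            args[i] = new
            return (head, *args), True
    for i, arg in enumerate(args):
        sub, done = substitute(arg, cls, new)
        if done:
            args[i] = sub
            return (head, *args), True
    return parse, False


def resolve_ellipsis(cls, new, history):
    """Find the most recent earlier use that accepts the given object."""
    for parse in reversed(history):
        result, done = substitute(parse, cls, new)
        if done:
            return result
    return None


history = [("MEASURE", "CURRENT", ("BASE", "Q4"))]   # assumed meaning of (5a)
print(resolve_ellipsis("TRANSISTOR", "Q5", history))  # meaning assigned to (5b)
```

This sketch takes the first accepting position it encounters; how ties between several accepting positions should be broken is not specified in the text above.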

Fuzziness

Having the grammar centered around semantic categories
allows the parser to be sloppy about the actual words it
finds in the statement. The parser is willing to ignore
some words while trying to understand a statement. The amount
of sloppiness (i.e., the number of words in a row which can be
ignored) is controlled in two ways. First, whenever a
grammar rule is invoked, the calling rule has the option of
limiting the number of words that can be skipped. Second,
each rule can decide which of its constituent pieces or
words are required and how fuzzy the search for them can be.
Taken together, these controls have the effect that the
normal mode of operation of the parser is tight in the
beginning of a sentence but more fuzzy after it has made
sense out of something.

Prescanning

Before a statement is given to the parser, three
operations are performed on the statement by a
pre-processor. The first operation is the expansion of any
abbreviations which occur in the statement. The second
operation is a cursory spelling correction. The third
operation is a reduction of compound words.

Spelling correction is attempted on any word of the
input string which the system does not recognize. The
spelling correction algorithm* takes the (possibly)
misspelled word and a list of correctly spelled words and
determines which (if any) of the correct words is close to
the misspelled word (using a metric determined by number of
transpositions, doubled letters, dropped letters, etc.).
During the prescan, the list of correct words is very small
(approximately a dozen) and is limited to very commonly
misspelled words and/or words which are critical to the
understanding of a sentence. The list is kept small so that
the time spent attempting spelling correction, prior to
attempting a parse, is kept to a minimum. Remember that the
parser has the ability to ignore words in the input string
so we do not want to spend a lot of time correcting a word
which won't be needed in understanding the statement. But
notice that certain words can be critical to the correct
understanding of a statement. For example, suppose that the
phrase "the base emitter current of Q3" was incorrectly
typed as "the bse emitter current of Q3". If "bse" were not
recognized as being "base" the parser would ignore it and
(mis-)understand the phrase as "the emitter current of Q3",
a perfectly acceptable but much different concept.* Because
of this problem, words like "base" are considered critical
and their spelling is corrected before any parse is
attempted. Note that there are a lot of words (e.g.,
"capacitor", "replace", "open", etc.) which if misspelled
would prevent the parser from making sense of the statement
but would not lead to any mis-understandings. These words
are therefore not considered to be "critical" and would be
corrected in the second attempt at spelling correction which
is done after a statement fails to parse.



*SOPHIE uses the spelling correction routines developed by
Teitelman for use in the DWIM facility of INTERLISP
(Teitelman, 1974).
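
The first, "critical words only" pass might look like the following Python sketch. It does not reproduce Teitelman's DWIM metric; it substitutes the similarity ratio of the standard `difflib` module, and the word lists and the 0.8 cutoff are assumptions chosen for the example.

```python
import difflib

# Assumed short list of critical words; the text says roughly a dozen.
CRITICAL = ["base", "emitter", "collector", "voltage", "current"]
# Assumed vocabulary of words the system already recognizes.
KNOWN = set(CRITICAL) | {"the", "of", "q3", "what", "is"}


def prescan_spelling(words):
    """Correct only unrecognized words that are close to a critical word."""
    corrected = []
    for word in words:
        lowered = word.lower()
        if lowered in KNOWN:
            corrected.append(word)  # recognized: leave untouched
            continue
        match = difflib.get_close_matches(lowered, CRITICAL, n=1, cutoff=0.8)
        # Unrecognized, non-critical words are left for the parser to skip,
        # or for the second correction pass after a failed parse.
        corrected.append(match[0] if match else word)
    return corrected


print(prescan_spelling("the bse emitter current of Q3".split()))
```

Keeping the candidate list short bounds the cost of this pass, since every unrecognized word is compared against only a handful of entries.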

Compound words are single concepts which appear in the
surface structure as a fixed series of more than one word.
Their reduction is very important to the efficient operation
of the parser. For example, in the question "What is the
voltage range switch setting?", "voltage range switch" is
rewritten as "VR". If not rewritten, "voltage" would be
mistaken as the beginning of a measurement (as in "What is
the voltage at N4") and an attempt would have to be made to
parse "range switch setting" as a place to measure voltage.
Of course after this failed, the correct parse can still be
found, but reducing compound words helps to avoid
backtracking. In addition, reduction of compound words
simplifies the grammar rules by allowing them to work with
larger conceptual units. In this sense, the prescan can be
viewed as a preliminary bottom-up parse that recognizes
local, multi-word concepts.
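
A compound-word reduction of this kind can be sketched as a longest-match rewrite over the word list. The Python below is illustrative only; the table `COMPOUNDS` is an assumed sample and is not SOPHIE's actual list of compound words.

```python
# Assumed sample table mapping fixed word sequences to single tokens.
COMPOUNDS = {
    ("voltage", "range", "switch"): "VR",
    ("constant", "current", "source"): "CURRENT/SOURCE",
    ("voltage", "control"): "VC",
}
LONGEST = max(len(key) for key in COMPOUNDS)


def reduce_compounds(words):
    """Rewrite each known multi-word sequence as one token, longest first."""
    out, i = [], 0
    while i < len(words):
        for n in range(LONGEST, 1, -1):
            key = tuple(w.lower() for w in words[i:i + n])
            if key in COMPOUNDS:
                out.append(COMPOUNDS[key])
                i += len(key)
                break
        else:
            out.append(words[i])  # no compound starts here
            i += 1
    return out


print(reduce_compounds("what is the voltage range switch setting".split()))
```

After the rewrite the parser never sees a bare "voltage" at that position, so it is not tempted to start a measurement and then back up.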



*To minimize the consequences of such mis-interpretation,
SOPHIE always responds with an answer which indicates
what question it is answering, rather than just giving the
numeric answer.

Conclusions

As stated in the beginning, our two goals for SOPHIE's
natural language processor are efficiency and friendliness.
In terms of efficiency, the parser has succeeded admirably.
The grammar is written in LISP which can be block compiled.
Using this technique, the complete parser takes up about 5k
of storage and parses a typical student statement consisting
of 8 to 12 words in around 150 milliseconds. Appendix C
provides exact timings for some of the statements in the
dialogue.



Our goal of friendliness is much harder to measure
since the only truly meaningful evaluation must be made when
students begin using SOPHIE in the classroom. However, our
results so far have been encouraging. The system has been
used in hundreds of hours of tests by people involved in the
SOPHIE project. In addition, about a dozen different people
have had realistic sessions (as opposed to demonstrations)
with SOPHIE and the parser was able to handle most of the
questions which were asked. Anytime a statement is not
accepted by the parser, it is saved on a disk file. This
information is constantly used to update and extend the
grammar.

The most problematic aspect of the natural language
processor is its generality and extensibility. The approach
taken to its development has been evolutionary: add a new
construct; see what other constructs it interferes with and
what new statements it encourages students to use; and
extend the parser to handle these. From the first sentence
ever parsed, we were aware of the possibility that the next
construct we wanted to add might be the "last straw" which
forced us into a fundamentally different approach. However,
we have not yet reached that limit. The results so far are
sufficiently impressive that we feel it worthwhile to find
out more about the limits of this technique. The most
unnatural limitation in the grammar right now is the lack of
conjunction. While incorporating conjunction will almost
certainly require the addition of an interrupt mechanism
similar to that allowed in PROGRAMMAR (Winograd, 1973), this
is possible within the present framework. In fact, we
believe that the semantic nature of our non-terminal
categories and the predictive ability of the semantic
grammar should provide a good handle on the computational
explosion normally accompanying conjunction. In any event,
we believe that SOPHIE has demonstrated the feasibility of
using natural language in mixed-initiative CAI systems.

APPENDIX A

This appendix gives a formal description of the
language accepted by SOPHIE. The grammar is implemented as
LISP functions and some examples are listed in Appendix C.
The parsing process is sketched out and a list of compound
words and abbreviations is given.

In the description, alternatives on the right-hand side
are separated by ! or are listed on separate lines.
Brackets [] enclose optional elements. An asterisk * is
used to mark notes about a particular rule. Non-terminals
are designated by names enclosed in angle brackets <>.

<circuit/place> := <terminal> ! <node>

<diode/spec> := <diode> ! <zener/diode>
                <section> diode ! <section> zener/diode

<junction> := <junction/type> [of] <transistor/spec>
              <transistor/term/type> and <transistor/term/type> [of]
                    [<transistor/spec>]
              <transistor/term/type> to <transistor/term/type> [of]
                    [<transistor/spec>]

<junction/type> := eb ! be ! ec ! ce ! cb ! bc

<meas/quant> := voltage ! current ! resistance* ! power
     *means measured resistance

<measurement> := <section> [output*] [<meas/quant>]
                 output* <meas/quant> [of] <section>
                 output* [<meas/quant>] [of <transformer>]
                 <transformer> <meas/quant>
                 <meas/quant> between** <circuit/place> and
                       <circuit/place>
                 <meas/quant> of*** <part/spec>
                 <meas/quant> between output terminals
                 <meas/quant> of <junction>
                 <meas/quant> of <circuit/place>
                 <meas/quant> from <junction>
                 <meas/quant> of <section>
                 <meas/quant> of <pronoun>
                 <junction/type> <meas/quant> [of <transistor/spec>]
                 <transistor/term/type> <meas/quant> of
                       [<transistor/spec>]
     *input also
     **from-to also works
     ***at, thru, in, into, across and through also work



<node> := junction of <part/spec> and <part/spec>
          node between <section> and <section>
          [point] between <part/spec> and <part/spec>
          <node/name> ! [node] <node/number>
          <pronoun>

<part/spec> := <part/name> ! <load/spec> ! <section> <part/type>
               <pronoun>

<pot/spec> := cc ! vc ! cct

<pronoun> := it ! [that] "type"*

<terminal> := output [terminal] ! <transistor/term> ! center/tap*
              positive terminal [<part/spec>] ! positive one
              negative terminal [<part/spec>] ! negative one
              anode [<diode/spec>] ! cathode [<diode/spec>]
              wiper [<pot/spec>]

<transistor/spec> := <transistor> ! <section> transistor ! <pronoun>

<transistor/term> := <transistor/term/type> [<transistor/spec>]

<transistor/term/type> := base ! collector ! emitter

<transistor>, <capacitor>, <diode>, <resistor>, <transformer> and
<zener/diode> all check the semantic network and parse correct part
names, e.g., R9, Q6.

<section> uses the semantic network to determine if a word is a
section of the unit, e.g., current/limiter.

<part/name> uses the semantic network to see if a word is the name of
a part, e.g., C4, T2.

<node/name> checks the semantic network for node names.



APPENDIX B 
A Rule from the Grammar



The grammar is encoded as LISP procedures. The ways of
expressing a non-terminal are embodied in a grammar
function. Each grammar function takes at least two
arguments: STR, the list of words to be recognized, and N,
the degree of fuzziness allowed. This function, in effect,
must determine whether the beginning of the string STR
contains an occurrence of the corresponding non-terminal.
There are generally two types of checks that a grammar
function performs. One is a check for the occurrence of a
word or words which satisfy certain predicates. This
checking is done with two functions, CHECKLST and
CHECKSTR. CHECKLST looks for a word in the string matching
any of a list of words. CHECKSTR looks for a word in the
string satisfying an arbitrary predicate. It is through
these functions that the parser implements its fuzziness.
For example, if CHECKSTR is called with the string "resistor
R9" and a predicate which determines if a word is the name
of a part (e.g., "R9"), CHECKSTR will succeed by skipping
the word "resistor", which in this phrase is a noise word.
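
The behaviour of the two checking functions might be approximated in Python as follows. The names `checkstr` and `checklst` echo CHECKSTR and CHECKLST, but the signatures, the use of list indices in place of LISP list pointers, and the part-name table are assumptions made for this sketch.

```python
PART_NAMES = {"R9", "Q5", "C4"}  # assumed; SOPHIE consults its semantic net


def is_part_name(word):
    return word.upper() in PART_NAMES


def checkstr(words, pos, predicate, n):
    """Find a word satisfying `predicate`, skipping at most `n` words.

    Returns the index of the matching word, or None if no word within
    the allowed window qualifies.
    """
    for i in range(pos, min(pos + n + 1, len(words))):
        if predicate(words[i]):
            return i
    return None


def checklst(words, pos, choices, n):
    """Like checkstr, but match any word from a fixed list of words."""
    return checkstr(words, pos, lambda w: w.lower() in choices, n)


print(checkstr(["resistor", "R9"], 0, is_part_name, 1))  # skips "resistor"
print(checkstr(["resistor", "R9"], 0, is_part_name, 0))  # no skipping allowed
```

With a fuzziness of 1 the noise word is skipped and the match succeeds; with a fuzziness of 0 the same call fails, which is how a calling rule can keep a search tight.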

The other usual type of operation performed by the
grammar functions is to check for the occurrence of other
non-terminals. This is done by calling the proper function
(grammar rule) and passing it the correct position in the
input string.

If a grammar rule is successful, the function passes
back two pieces of information. First, it returns some
indication of how much of the input string has been accepted
(i.e., where it stopped). The convention adopted is that
the grammar rule returns as its value a pointer to the last
word in the string accepted by the rule. Second, the
function passes back a structural description of the phrase
that was parsed. This structure is passed back in the free
variable RESULT (analogous to an ATN's "*" upon return from
a PUSH (Woods, 1973)).

Listed below is the grammar rule for the concept of a
junction of a transistor. This rule accepts phrases such as
"base emitter junction of Q5", "BE of the current limiting
transistor", or "collector emitter junction".

(<JUNCTION>
  (LAMBDA (STR N)
    (PROG (TS1 R1)
      (RETURN
        (AND
          (* comment A)
          [OR (AND (SETQ TS1 (<JUNCTION/TYPE> STR N))
                   (SETQ R1 RESULT))
              (AND (SETQ TS1 (<TRANSISTOR/TERM/TYPE> STR N))
                   (SETQ R1 RESULT)
                   [SETQ TS1
                     (<TRANSISTOR/TERM/TYPE>
                       (CDR (CHECKLST (CDR TS1)
                                      (QUOTE (AND TO]
                   (SETQ R1 (JUNCTION-OF-TERMS R1 RESULT]
          (* comment B)
          (COND
            ((SETQ STR (<TRANSISTOR/SPEC>
                         (CDR (GOBBLE (GOBBLE TS1 (QUOTE (JUNCTION)))
                                      (QUOTE (OF))))
                         1))
             (SETQ RESULT (LIST R1 RESULT))
             STR)
            ((SETQ RESULT (LIST R1 (LIST (QUOTE PREF)
                                         (QUOTE (TRANSISTOR)))))
             TS1)))))))

Comment A: 

The first thing that is looked for is either a
<junction/type> (BE, emitter collector, etc.) or two
<transistor/term/type>s (base, emitter or collector)
separated by the words "and" or "to". If two terminals are
found, the function JUNCTION-OF-TERMS is called to determine
the correct junction. In either case, the place where the
successful subsidiary rule left off is saved in TS1 and the
meaning of the accepted phrase is saved in R1.



Comment B: 

The next thing a junction needs is a transistor
(<TRANSISTOR/SPEC>). <TRANSISTOR/SPEC> looks for an
occurrence of a transistor, e.g., "Q5" or "current limiting
transistor". GOBBLE is a function for skipping relational
words when they are not used to restrict the remaining part
of the phrase. If a transistor is not found, a deletion is
hypothesized and a call to PREF is constructed. If the
transistor has been pronominalized as in "the base emitter
of it", <TRANSISTOR/SPEC> would recognize "it". In either
case the semantics of the recognized phrase (something like
(EB Q5)) is put into RESULT and a pointer to the last
recognized word is returned as the value of <JUNCTION>.
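
For readers unfamiliar with INTERLISP, the control flow of this rule can be paraphrased in Python. This is a loose sketch under several assumptions, not a translation of SOPHIE's code: it returns a pair (index of the next unread word, meaning) instead of a list pointer plus the free variable RESULT, it treats the separator "and"/"to" as optional, it knows only a small assumed set of transistor names, and the canonical junction names (EB, CB, EC) are a guess at what JUNCTION-OF-TERMS produces.

```python
TERM_TYPES = {"base": "B", "emitter": "E", "collector": "C"}
JUNCTION_TYPES = {"eb", "be", "ec", "ce", "cb", "bc"}
CANONICAL = {frozenset("BE"): "EB", frozenset("BC"): "CB", frozenset("CE"): "EC"}
TRANSISTORS = {"Q5", "Q6"}            # assumed part names
RELATIONAL = {"junction", "of", "the"}  # words skipped in the spirit of GOBBLE
PREF_TRANSISTOR = ("PREF", ("TRANSISTOR",))


def junction(words, pos=0):
    """Recognize a junction at words[pos:]; return (next_pos, meaning) or None."""
    if pos >= len(words):
        return None
    word = words[pos].lower()
    # Comment A: a junction type, or two terminal types.
    if word in JUNCTION_TYPES:
        jtype, pos = CANONICAL[frozenset(word.upper())], pos + 1
    elif word in TERM_TYPES:
        first, pos = TERM_TYPES[word], pos + 1
        if pos < len(words) and words[pos].lower() in ("and", "to"):
            pos += 1
        if pos >= len(words) or words[pos].lower() not in TERM_TYPES:
            return None
        jtype = CANONICAL.get(frozenset(first + TERM_TYPES[words[pos].lower()]))
        if jtype is None:  # the same terminal named twice
            return None
        pos += 1
    else:
        return None
    # Comment B: look for the transistor, skipping relational words.
    look = pos
    while look < len(words) and words[look].lower() in RELATIONAL:
        look += 1
    if look < len(words) and words[look].upper() in TRANSISTORS:
        return look + 1, (jtype, words[look].upper())
    if look < len(words) and words[look].lower() == "it":
        return look + 1, (jtype, PREF_TRANSISTOR)  # pronominalized
    return pos, (jtype, PREF_TRANSISTOR)           # deletion hypothesized


print(junction("base emitter junction of Q5".split()))
print(junction("collector to emitter".split()))
```

As in the original rule, a missing transistor and the pronoun "it" both produce the same PREF call, so later reference resolution does not need to distinguish them.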

At present there are approximately 80 grammar rules in
SOPHIE's grammar.

APPENDIX C
Sample Parses and Parse Times



The following are examples of statements handled by the
natural language processor. Under each statement, the
semantic interpretation returned by the parser is given.
The semantic interpretation is a function call which when
evaluated performs the processing required by the statement.
Parse times are given in milliseconds.

Insert a fault.
     (INSERTFAULT NIL)  (85 ms)

What is the output voltage?
     (MEASURE VOLTAGE NIL OUTPUT)  (80 ms)

What is the voltage between the current limiting transistor and
the constant current source?
     (MEASURE VOLTAGE (NODE/BETWEEN (FINDPART CURRENT/LIMITER
                                              TRANSISTOR)
                                    CURRENT/SOURCE))  (335 ms)

What is the voltage between there and the base of Q6?
     (MEASURE VOLTAGE (PREF (NODE TERMINAL)) (BASE Q6))  (100 ms)

Q5?
     (REFERENCE ((TRANSISTOR) Q5))  (95 ms)

Could the problem be that Q5 is bad?
     (TESTFAULT Q5 BAD)  (100 ms)

Could it be shorted?
     (TESTFAULT (PREF (PART JUNCTION TERMINAL)) SHORT)  (75 ms)

If r8 were 30k what would the output voltage be?
     (IFTHEN (R8 30000.0 VALUE) (MEASURE VOLTAGE NIL OUTPUT))  (220 ms)

If C2 were leaky what would the voltage across it be?
     (IFTHEN (C2 LEAKY) (MEASURE VOLTAGE (PREF (PART JUNCTION)) NIL))
     (120 ms)

What is the output voltage when the voltage control is set to .5?
     (RESETCONTROL (STQ VC .5) (MEASURE VOLTAGE NIL OUTPUT))  (170 ms)

What is it with it set at .6?
     (RESETCONTROL (STQ (PREF (POT LOAD SWITCH)) .6) (REFERENCE NIL))
     (110 ms)

If it is set to .9?
     (RESETCONTROL (STQ (PREF (POT LOAD SWITCH)) .9) (REFERENCE NIL))
     (135 ms)

. . .

What is the current thru the CC when the VC is set to 1.0?
     (RESETCONTROL (STQ VC 1.0) (MEASURE CURRENT CC NIL))  (190 ms)




